RECURSIVE MARKOV PROCESS

By Shohei Hidaka 

Japan Advanced Institute of Science and Technology 

A Markov process, which is constructed recursively, arises in
stochastic games with Markov strategies. In this study, we define
a special class of random processes called the recursive Markov process,
which has infinitely many states but can be expressed in a closed
form. We derive the characteristic equation which the marginal stationary
distribution of an arbitrary recursive Markov process needs
to satisfy.


1. Introduction. For $N \in \mathbb{N}$, we consider the evolution of a random
variable $X_t \in \mathcal{N} = \{1, 2, \ldots, N\}$ over discrete time steps $t \in \mathbb{Z}$. We call a
random process $k$-th order Markov if the transition probability of a state
at any time step is determined only by its $k$ past states:

(1.1) $P(X_t \mid X_{t-1}, X_{t-2}, \ldots, X_{t-k}) = P(X_t \mid X_{t-1}, X_{t-2}, \ldots)$.

Denote the series of $k$ states by

$X_{t-k+1}^{t} := (X_t, X_{t-1}, \ldots, X_{t-k+1}) \in \mathcal{N}^k$.

Applying (1.1) $k$ times, we obtain the transition probability $P(X_{t-k+1}^{t} \mid X_{t-2k+1}^{t-k})$
for the first order Markov process over the series of states $X_{t-k+1}^{t}$.

Denote the set of $N$-dimensional vectors with real entries by $\mathbb{R}^N$ and the set of $N \times M$
matrices with real entries by $\mathbb{R}^{N \times M}$. Denote by $\mathbb{S}^N$ the $(N-1)$-dimensional
simplex, on which an arbitrary probability vector $\theta \in \mathbb{S}^N$ satisfies $1_N^{\top}\theta = 1$
and each of its elements satisfies $(\theta)_i \ge 0$. Denote a simplex matrix by
$Q \in \mathbb{S}^{N \times M}$, if each of the $M$ column vectors of the matrix $Q$ is a
simplex vector in $\mathbb{S}^N$. For a transition probability over the series of $k$ states, there
is a corresponding transition matrix $Q \in \mathbb{S}^{N^k \times N^k}$. The process is stationary if the
probability distribution over the series of $k$ states does not depend on $t$.
The stationary probability vector $\theta = (P(1), P(2), \ldots, P(N^k))^{\top} \in \mathbb{S}^{N^k}$ of a
Markov process with a transition matrix $Q$ is a root of the equation

(1.2) $\theta = Q\theta$.
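
As a concrete illustration of (1.2), the following Python sketch computes the stationary vector of a small column-stochastic matrix as the eigenvector belonging to the eigenvalue 1. The 2 x 2 matrix and the helper name `stationary_vector` are illustrative assumptions and do not come from the paper; NumPy is assumed to be available.

```python
import numpy as np

def stationary_vector(Q: np.ndarray) -> np.ndarray:
    """Return theta with theta = Q @ theta for a column-stochastic matrix Q."""
    eigvals, eigvecs = np.linalg.eig(Q)
    # Pick the eigenvector whose eigenvalue is closest to 1.
    idx = np.argmin(np.abs(eigvals - 1.0))
    theta = np.real(eigvecs[:, idx])
    # Normalise so that the entries sum to one (this also fixes the sign).
    return theta / theta.sum()

# Hypothetical transition matrix: each column is a probability vector.
Q = np.array([[0.9, 0.5],
              [0.1, 0.5]])
theta = stationary_vector(Q)
```

For this matrix the balance condition gives the stationary vector (5/6, 1/6), which the eigenvector computation reproduces up to floating-point error.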

In this paper, we consider a class of infinite order Markov processes
$X \in \mathcal{N}^{\infty}$ which can be constructed recursively. We call this class the
recursive Markov process. By constructing a $k$-th order transition matrix
$Q^{(k)} \in \mathbb{S}^{N^k \times N^k}$, a recursive Markov process is
defined in the limit $\lim_{k \to \infty} Q^{(k)}$. We will give this construction formally
in Section 3. This definition of a recursive Markov process is motivated by 
studying stochastic games [9] with players stochastically reasoning according 
to past experience. Recently, this class of stochastic games has been studied 
intensively in game theoretic studies from both theoretical and behavioral 
points of view [1, 6, 8, 3, 2, 7]. We perform an analysis based on a recursive 
Markov process for a stochastic game in Section 4. 

As the main result of this paper we prove that, for an arbitrary recursive
Markov process, the marginal stationary distribution $\omega \in \mathbb{S}^N$ satisfies an
analogous form to (1.2):

(1.3) $\omega = Q(\omega)\omega$,

where the transition matrix $Q(\omega) \in \mathbb{S}^{N \times N}$ is a function of $\omega$.

In Section 2, we introduce our notation, and formulate a Markov process
in a linear algebraic form. Introducing three basic operators, we give
an extension of the transition matrix, called a shift matrix. Our analysis 
on the shift matrix illustrates the general properties which any stationary 
distribution satisfies. In Section 3, we define the recursive Markov process, 
and show the main result (1.3). In Section 4, we give an application of a 
recursive Markov process for a stochastic game. 

2. Markov process.

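2.1. Transition matrix. For each $i$, let us write $X_i \in \mathcal{N}$ and $X =
(X_1, X_2, \ldots, X_k) \in \mathcal{N}^k$. For the $k$-th order Markov process (1.1), we assign
each state in $\mathcal{N}^k$ an integer in the set $\mathcal{C}_{N,k} := \{1, 2, \ldots, N^k\}$ by the
indexing map $h_{N,k} : \mathcal{N}^k \to \mathbb{Z}$ defined by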

k 

hN,k (x) ■.= l + J2iXj-l)NT 

i=i 

This encoding of the states is done without loss of generality and is used 
throughout this paper. 

For $i \in \mathcal{C}_{N,k}$, denote the set of integers by

'hLN,k{i) '■= |^A,fc(-’^i) : i = ^A,fc (^X^ I . 
This set consists of the $N$ indices of those states in the $k$-th order Markov
process which can be reached from the state $i$. First we define the transition
matrix $Q^{(k)} \in \mathbb{S}^{N^k \times N^k}$ for the $k$-th order Markov process with respect to this
encoding as follows.

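Definition 2.1 (Transition matrix). The transition matrix $Q^{(k)} \in \mathbb{S}^{N^k \times N^k}$
for the $k$-th order Markov process is defined by

$(Q^{(k)})_{i,j} := P(h_{N,k}^{-1}(i) \mid h_{N,k}^{-1}(j))$,

where $(Q)_{i,j}$ is the $(i,j)$ element of the matrix $Q$. Observe that, unless
$i \in \mathcal{H}_{N,k}(j)$, $(Q^{(k)})_{i,j} = 0$.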


In order to analyze the properties of the transition matrix $Q^{(k)}$, let us
introduce the vectors and matrices as follows. Let us denote the zero vector
by $0_N = (0, 0, \ldots, 0)^{\top} \in \mathbb{R}^N$, the identity matrix by $E_N \in \mathbb{R}^{N \times N}$, and the
unit vector by

$e_{N,i} := (0, \ldots, 0, 1, 0, \ldots, 0)^{\top} \in \mathbb{S}^N$,

and let := eN,ieJj^. We define a special permutation matrix called the 
commutation matrix [5] by: 

m 

Cn,m ■— ^ ^ ^m,i ® ® ^m,i- 

i=l 

where $\otimes$ denotes the Kronecker product.
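
The commutation matrix can be checked numerically. The sketch below builds one as a sum of Kronecker products of unit vectors and an identity matrix, and verifies the swapping property $C_{n,m}(a \otimes b) = b \otimes a$ for $a \in \mathbb{R}^m$ and $b \in \mathbb{R}^n$. The function name `commutation_matrix` and the choice of which unit vector is transposed are assumptions made for illustration; the opposite convention yields the transposed (inverse) permutation.

```python
import numpy as np

def commutation_matrix(n: int, m: int) -> np.ndarray:
    """Sum over i of kron(kron(e_i^T, I_n), e_i), with e_i the i-th unit vector of R^m."""
    C = np.zeros((n * m, m * n))
    eye_n = np.eye(n)
    for i in range(m):
        e = np.zeros((m, 1))
        e[i, 0] = 1.0
        # (1 x m) kron (n x n) kron (m x 1) gives an (nm x mn) matrix.
        C += np.kron(np.kron(e.T, eye_n), e)
    return C

a = np.array([1.0, 2.0, 3.0])   # a in R^m with m = 3
b = np.array([4.0, 5.0])        # b in R^n with n = 2
C = commutation_matrix(n=2, m=3)
swapped = C @ np.kron(a, b)     # expected to equal kron(b, a)
```

Because the result is a permutation matrix, `C @ C.T` is the identity, so the inverse reordering is obtained by transposition.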

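For the state $i \in \mathcal{C}_{N,k}$ of a $k$-th order Markov process with $Q^{(k)} \in \mathbb{S}^{N^k \times N^k}$
and its corresponding indices $m_1 < \ldots < m_N \in \mathcal{H}_{N,k}(i)$, we define

$q_i^{(k)} := (P(m_1 \mid i), \ldots, P(m_N \mid i))^{\top} \in \mathbb{S}^N$,

and for $i \in \mathcal{C}_{N,k-1}$ we define the simplex matrix

$Q_i^{(k)} := (q_{N(i-1)+1}^{(k)}, \ldots, q_{N(i-1)+N}^{(k)}) \in \mathbb{S}^{N \times N}$.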

Using the above notation, Hidaka and his colleagues [4] showed that an arbitrary
$k$-th order transition matrix can be decomposed as:


i&Cisr,k-i 


= Cn,n^-^ 

2.2. Linear operators. In this section, we introduce three linear operators
in their matrix forms to show the basic properties of the $k$-th order transition
matrix, serving as the foundation of our main result.


Definition 2.2. For $0 < m < k$, we define the $k$-th order marginalization
matrix

:= ENm ® G 

For 0 < m < A: and the tuple of vectors := we define 

the k^^ order branching matrix 

{qP) ■■= E e 

ieCjv ,m 5 — m 


For 0 < m < k, we define the cycling matrix 


*— C]\Jm ^]\jk — 7n — ^ ^ ® E]\Jm ej\fk — m^j^. 

'^^C-N ,k — m 


Let the vector $\theta \in \mathbb{S}^{N^k}$ be an arbitrary stochastic vector
consisting of the probabilities $P(X_1, X_2, \ldots, X_k)$ for $h_{N,k}(X_1, X_2, \ldots, X_k) \in
\mathcal{C}_{N,k}$. Let $Q_N^{(k)} := (q_1^{(k)}, \ldots, q_{N^k}^{(k)})$ be the tuple of simplex vectors, such that
each vector $q_i^{(k)}$ consists of the conditional probabilities
$P(Y \mid X_1, X_2, \ldots, X_k)$. Then, the three types of matrices introduced above
correspond to the operators on the stochastic vector as follows.

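1. Marginalization maps $\theta$ to $P(X_1, \ldots, X_m, X_{m+2}, \ldots, X_k)$.

2. Branching maps $\theta$ to $P(X_1, \ldots, X_m, Y, X_{m+1}, \ldots, X_k)$.

3. Cycling maps $\theta$ to $P(X_{m+1}, X_{m+2}, \ldots, X_k, X_1, \ldots, X_m)$.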

Figure 1 illustrates the branching and marginalization matrices for $N = 2$
and $k = 1, 2, 3$.

The reader can confirm the properties of these operators above by finding 
the following identities. For an arbitrary tuple of simplex vectors Qjy 
0 < m,n < k, we have the identity 

Mik) = 

m ^m—n n ^n—m ’ 


7\//(A:)7Vf(A:+l) 


MlpMppP forn > m 
otherwise. 

Fig 1. Branching and marginalization matrix for $N = 2$ and $k = 1, 2, 3$.


and 


{qP) = (qSJ^) 

For an arbitrary integer m, m, 


modfc(nm) 



m 


Now it is easy to see that the $k$-th order transition matrix
(2.1), which is the “shift” operator advancing the distribution over the series
of $k$ states by one time step, can be written
with the three matrices as follows.


Proposition 2.1 (Transition matrix as the shift operator). Denote an 
arbitrary transition matrix by ^ corresponding tuple 

of vectors := in, •) . Then we have 

Q{k) ^ (Qa ) = Cf . 

Proof. 

^ I™. 

*£Civ,fc-i jSCjv.i 

i&Cisr^k-i 
□


2.3. k-shift matrix. Let us introduce a $k$-shift operator, which is an extension
of Proposition 2.1 as follows.

Definition 2.3. For a series of tuples ..., define a k-shifting 
transition matrix: 

s ..., := . 

We can easily see from this definition that the transition matrix is identical
to the 1-shift matrix. The $k$-shift matrix is written in
the recursive form as follows.


Lemma 2.1 (Recursive property of the marginal matrix). Given tuples
of vectors $Q_N^{(m)} := (q_1^{(m)}, \ldots, q_{N^m}^{(m)}) \in \mathbb{S}^{N \times N^m}$ for $m = 0, 1, \ldots, k$, we can
write the corresponding $k$-shifting transition matrix in a recursive form:





J^m-l 

i=l 



m—1 


where for 1 < m < k and 1 < i < N 

Am) 7v(m+i) (m, 

and gf := In for 1 < i < NK 


) 77(m+l) (m) 


Proof. Observe the recurrent relationship between 

5 (qSJ^) = ® gf) 0 

i=l 

and 

5 qSJ^) = [qP) 

^k-2 

C/vfe-2^j ® g* <8>ejYfe-2^j 

i=l 

where Qf o-nj QPQN(i-i)-ej' 1 < m < A:, find the recur¬ 
sive relationship between S (^Qp , Qp^ and S (^Qp\ • • •, Qp^ by 

inductively writing Qp := Ylf=i ® Qp\PPl)+j- □ 

2.4. Marginal stationary distribution. As we often need the marginal distribution
rather than the full stationary distribution, it is crucial to describe
the property of the marginal stationary distribution. With the $k$-shift matrix
defined in the previous section, we can now analyze a general property of
an arbitrary marginal stationary distribution. Before stating the marginal
stationary distribution, let us note that an arbitrary stationary vector is
uniquely expressed with branching matrices as follows.

Proposition 2.2. For every $\theta \in \mathbb{S}^{N^k}$, there is a unique tuple of vectors
$\Theta^{(m)} := (\theta_1^{(m)}, \ldots, \theta_{N^m}^{(m)})$ for $m = 0, 1, \ldots, k-1$, which satisfies

0 = B'td {b'M'’) Bfs? (bV) ■ ■ ■ 4°’ (e«) ■ 


Proof. Given 9^^'i G for i G CN,k -2 and j G Cn,i dehne for i G 

CN,k-l 

■ -A .= 


m=l 


2+m 


and 


q{k) 

W(i-l)+j 


:= 


Af(i—l)+j+l 


odO 

5 • • • 5 ^N(i— 


N{i—l)+j+N 


- 1 ) 


N{i-l)+j 


gE^. 


Apply this definition recursively for — 1,...,0. 


□ 


Now we are ready to state the lemma as follows. 

Lemma 2.2. According to Proposition 2.2, with 0^^ := (0i^\ ..., 

V / 

for m = 0,1,... ,k, write 9 = (® 7 V^) ^ ■ For a 

transition matrix G ^ which holds 9 = Q^^'>9, we have 

g(0) ^ 0 (^) 0 (O) ^ 0 ( 2 ) 51 ( 0 ) = _ _ _ = 0 (^) 5 )(O) ^ 

where := <S (0(i),..., 0(”")) andQ^^^ := <S (0(i),..., 0('=), Q(^)) . 
Proof. Define ior 1 <m <k 
:= 

and denote := M^g and := Then we have 

^ = m222-i and ©("^“^^^(o) = M^^9. 
As 6 = Q6, we have 0^”^ ^^ 00 ) = 0 (™-)^(o) for 1 < m < /c, as 

For m = 1, we have □ 


Lemma 2.2 implies the necessary condition that the marginal stationary
vector satisfies. As each of these conditions still involves a part of the full
stationary distribution, we still need to know the full stationary
distribution to calculate its marginal distribution. This requirement,
however, can be relaxed when we have a converging series of stationary
vectors as $k \to \infty$. This is formally stated by the following
theorem, which replaces the stationary distribution with its corresponding
transition matrices in the condition on the marginal vector.

Theorem 2.1. Suppose that, for each integer k > t), we have a order 
transition matrix corresponding tuples of vectors 

■ Denote the stationary vectors by 9^^^ G which 

holds 9^^'> = and denote the marginal stationary vector by ujm'^ := 

... Mf G Then we have 

- *5 (qST^ • • • > Q^n) 

if we have the convergence 

lim =Oj^k. 

fc^OO 


Proof. By Proposition 2.2, for m > 1 we can uniquely write the 
order stationary vector 

0M = (eSi) ■ ■ • (©r^) e 


where ©i™) := ..., 

corresponding with define 



for n = 0,1,..., m. With the tuple 

:=5(0f\...,0f,QSJ)). 


Given ujn^ = by Lemma 2.2, the convergence lim^^oo ~ 

Aff ^ 0 (^+ 1 ) = Oj^k implies 


hm gf-5 

k^oo 


p. (fe—1) 


0 


k-l ’ 


^(fc—1) 



— ^N,N- 


Applying this recursively, we obtain the theorem. 


□ 


3. Recursive Markov process. Theorem 2.1 implies that the marginal
stationary vector $\omega_m^{(k)}$ is obtained from the $k$-shift matrix
in the limit $k \to \infty$. This theorem motivates us to consider a special class of
Markov processes with a converging series of transition matrices in a certain
form as follows.

Definition 3.1 (Recursive Markov process). We call a $k$-th order Markov
process with the transition matrix $Q^{(k)}$ recursive, if each element of the block
matrix $Q_i^{(k)}$ is a function of the elements of $q_i^{(m)} \in \mathbb{S}^N$ for $1 \le m < k$,
$i \in \mathcal{C}_{N,m}$.

This definition of the recursive Markov process is motivated by the fact 
that we can analyze the convergence of such a series of transition matrices in 
a closed form. The following corollary states that this class is characterized 
by a closed-form equation of the marginal stationary distribution. 

Corollary 3.1. Suppose there is a map $f : \mathbb{S}^N \to \mathbb{S}^{N \times N}$ with which
an infinite order recursive Markov process satisfies $Q_i^{(m+1)} = f(q_i^{(m)})$ for
$m = 1, 2, \ldots$ and $i \in \mathcal{C}_{N,m}$. Denote by $\omega \in \mathbb{S}^N$ the fixed point for the linear
transformation $f(\omega)$, which satisfies $\omega = f(\omega)\omega$. Then, the marginal stationary
vector of the $k$-th order stationary vector $\theta^{(k)}$ in the limit $k \to \infty$
corresponds with $\omega$ as follows:

UJ = lim Mf ^ G 

k—^oo 

if the limit shift matrix Q := limfc^oo<S ■ ■ ■, recursive 

Markov process is irreducible. 

Proof. Denote $Q = (q_1, \ldots, q_N) \in \mathbb{S}^{N \times N}$. According to Theorem 2.1,
the marginal stationary distribution $\omega$ satisfies $\omega = Q\omega$, and the recursive
Markov process satisfies $q_i = Q q_i = f(q_i) q_i$ for $i \in \mathcal{C}_{N,1}$. As $Q$ is irreducible,
$\omega = q_i$ and $\omega = f(\omega)\omega$. □
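
To make the characteristic equation $\omega = f(\omega)\omega$ concrete, the following Python sketch iterates $\omega \leftarrow f(\omega)\omega$ until the vector stops changing. The map `f`, its coefficients, and the plain fixed-point iteration are all assumptions chosen for illustration; they are not taken from the paper, and such an iteration is not guaranteed to converge for an arbitrary map.

```python
import numpy as np

def f(omega: np.ndarray) -> np.ndarray:
    """Hypothetical map from a simplex vector to a 2x2 column-stochastic matrix."""
    a = 0.2 + 0.1 * omega[0]   # probability of moving from state 0 to state 1
    b = 0.3 + 0.1 * omega[1]   # probability of moving from state 1 to state 0
    return np.array([[1.0 - a, b],
                     [a, 1.0 - b]])

def solve_fixed_point(f, omega0, tol=1e-12, max_iter=10_000):
    """Iterate omega <- f(omega) @ omega until successive iterates agree within tol."""
    omega = np.asarray(omega0, dtype=float)
    for _ in range(max_iter):
        new = f(omega) @ omega
        new = new / new.sum()          # guard against rounding drift off the simplex
        if np.max(np.abs(new - omega)) < tol:
            return new
        omega = new
    raise RuntimeError("fixed-point iteration did not converge")

omega = solve_fixed_point(f, [0.5, 0.5])
```

For this particular map the update of the first component reduces to $x \mapsto 0.3x + 0.4$, a contraction with fixed point $4/7$, so the iteration converges quickly.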

4. Application. As a minimal example which motivates the recursive
Markov process, we provide an analysis of a two-armed bandit problem in
this section. Consider a gambler with two options, betting on either arm,
denoted as 0 or 1. By betting one dollar on either arm at each step, he
may win one dollar or lose his dollar. Suppose that the two arms 0 and 1 have
constant winning rates $p_0$ and $p_1$, respectively, which the gambler does not
know beforehand. This gambler has a sufficiently large number of dollars,
and wishes to find the arm with the best winning rate in the
long term.

As an example, let us consider the betting strategy with the quantitative
confidence level $(q_0, q_1) \in \mathbb{R}^2$, $q_0, q_1 > 0$. At every step, the gambler bets on
arm 0 with probability $q_0/(q_0 + q_1)$. Given the confidence level $(q_0, q_1)$, the
gambler updates his confidence level to $(q_0 A, q_1)$ with a multiplier $A > 1$ if
he wins with arm 0; otherwise, he sets it to $(q_0/A, q_1)$. The update for $q_1$ is
analogous when he chooses arm 1.
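
The update rule described above can be simulated directly. The following Python sketch is a minimal simulation under stated assumptions: the initial confidence levels are both 1, the confidence levels are stored as logarithms to avoid overflow, and the function name `simulate_bandit`, its parameters, and the fixed random seed are hypothetical choices made for illustration.

```python
import math
import random

def simulate_bandit(p0, p1, multiplier, steps, seed=0):
    """Simulate the confidence-update strategy and return the fraction of bets
    placed on arm 0 during the second half of the run."""
    rng = random.Random(seed)
    win_rate = (p0, p1)
    log_q = [0.0, 0.0]                 # log confidence levels; q0 = q1 = 1 initially
    log_mult = math.log(multiplier)
    arm0_bets = 0
    for t in range(steps):
        # P(bet on arm 0) = q0 / (q0 + q1) = 1 / (1 + q1/q0), evaluated in log space.
        d = log_q[1] - log_q[0]
        prob0 = 1.0 / (1.0 + math.exp(d)) if d < 700 else 0.0
        arm = 0 if rng.random() < prob0 else 1
        win = rng.random() < win_rate[arm]
        # Multiply the chosen arm's confidence by the multiplier on a win, divide on a loss.
        log_q[arm] += log_mult if win else -log_mult
        if t >= steps // 2 and arm == 0:
            arm0_bets += 1
    return arm0_bets / (steps - steps // 2)

fraction_arm0 = simulate_bandit(p0=0.9, p1=0.1, multiplier=1.5, steps=2000)
```

With strongly separated winning rates such as these, the simulated gambler ends up betting almost exclusively on the better arm in the second half of the run.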

We can view this strategy either as a first order Markov process with
infinitely many states or as a recursive Markov process. For this problem,
analysis of the marginal stationary distribution of the recursive Markov process
is sufficient, as we are only interested in whether this gambler may end
up betting on the arm with the best winning rate almost every time.

According to Corollary 3.1, this random process, if it has a stationary 
distribution, is characterized by the equation 



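(4.1)
$$
\frac{1}{q_0 + q_1}
\begin{pmatrix} p_0 q_0 \\ \bar{p}_0 q_0 \\ p_1 q_1 \\ \bar{p}_1 q_1 \end{pmatrix}
=
\begin{pmatrix}
\frac{p_0 q_0 A}{q_0 A + q_1} & \frac{p_0 q_0 / A}{q_0 / A + q_1} & \frac{p_0 q_0}{q_0 + q_1 A} & \frac{p_0 q_0}{q_0 + q_1 / A} \\
\frac{\bar{p}_0 q_0 A}{q_0 A + q_1} & \frac{\bar{p}_0 q_0 / A}{q_0 / A + q_1} & \frac{\bar{p}_0 q_0}{q_0 + q_1 A} & \frac{\bar{p}_0 q_0}{q_0 + q_1 / A} \\
\frac{p_1 q_1}{q_0 A + q_1} & \frac{p_1 q_1}{q_0 / A + q_1} & \frac{p_1 q_1 A}{q_0 + q_1 A} & \frac{p_1 q_1 / A}{q_0 + q_1 / A} \\
\frac{\bar{p}_1 q_1}{q_0 A + q_1} & \frac{\bar{p}_1 q_1}{q_0 / A + q_1} & \frac{\bar{p}_1 q_1 A}{q_0 + q_1 A} & \frac{\bar{p}_1 q_1 / A}{q_0 + q_1 / A}
\end{pmatrix}
\frac{1}{q_0 + q_1}
\begin{pmatrix} p_0 q_0 \\ \bar{p}_0 q_0 \\ p_1 q_1 \\ \bar{p}_1 q_1 \end{pmatrix}
$$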


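where $\bar{p}_i := 1 - p_i$ for $i = 0, 1$. Without loss of generality, we can reduce
(4.1) as follows:

$$
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
=
\begin{pmatrix}
\frac{A}{q_0 A + q_1} & \frac{1}{q_0 + q_1 A} \\
\frac{1}{q_0 A + q_1} & \frac{A}{q_0 + q_1 A}
\end{pmatrix}
\begin{pmatrix} p_0 q_0 + \bar{p}_1 q_1 \\ \bar{p}_0 q_0 + p_1 q_1 \end{pmatrix}
$$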


Solving this equation for given $p_0$, $p_1$, $A$, we have the unique solution

$$
\frac{q_0}{q_1} = \frac{A \bar{p}_1 - p_1}{A \bar{p}_0 - p_0},
$$

if the right hand side is positive. This implies that this strategy successfully 
converges to the desired choice for sufficiently large A —oo: 


hm ® 
A^oo 


oo for Po > Pi 
1 for Po = Pi 
0 for Po < Pi 


This simple case study demonstrates how powerful the analysis of a Markov 
process can be if it is recursive. 